Robust Speech Recognition Using Factorial HMMs for Home Environments
نویسندگان
چکیده
We focus on the problem of speech recognition in the presence of nonstationary sudden noise, which is very likely to happen in home environments. As a model compensation method for this problem, we investigated the use of factorial hidden Markov model (FHMM) architecture developed from a clean-speech hidden Markov model (HMM) and a sudden-noise HMM. While in conventional studies this architecture is defined only for static features of the observation vector, we extended it to dynamic features. In addition, we performed home-environment adaptation of FHMMs to the characteristics of a given house. A database recorded by a personal robot called PaPeRo in home environments was used for the evaluation of the proposed method. Isolated word recognition experiments demonstrated the effectiveness of the proposed method under noisy conditions. Home-dependent word FHMMs (HD-FHMMs) reduced the word error rate by 20.5% compared to that of the clean-speech word HMMs.
منابع مشابه
Environmental Sound Source Identifica Model for Robust Speech
In real acoustic environments, humans communicate with each other through speech by focusing on the target speech among environmental sounds. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a selfmoving robot to sen...
متن کاملFactorial HMMs for acoustic modeling
Despite the success of hidden Markov models (HMMs) and other techniques for speech recognition, there remains a wide perception in the speech research community that new ideas are needed to continue improvements in performance. This paper represents a contribution to this effort. We describe preliminary experiments using an alternative modeling approach known as factorial hidden Markov models (...
متن کاملGeneralized Variable Parameter HMMs for Noise Robust Speech Recognition
Handling variable ambient noise is a challenging task for automatic speech recognition (ASR) systems. To address this issue, multi-style, noise condition independent (CI) model training using speech data collected in diverse noise environments, or uncertainty decoding techniques can be used. An alternative approach is to explicitly approximate the continuous trajectory of Gaussian component mea...
متن کاملLinear regression based Bayesian predictive classification for speech recognition
The uncertainty in parameter estimation due to the adverse environments deteriorates the classification performance for speech recognition. It becomes crucial to incorporate the parameter uncertainty into decision so that the classification robustness can be assured. In this paper, we propose a novel linear regression based Bayesian predictive classification (LRBPC) for robust speech recognitio...
متن کاملImproving lip-reading performance for robust audiovisual speech recognition using DNNs
This paper presents preliminary experiments using the Kaldi toolkit [1] to investigate audiovisual speech recognition (AVSR) in noisy environments using deep neural networks (DNNs). In particular we use a single-speaker large vocabulary, continuous audiovisual speech corpus to compare the performance of visual-only, audio-only and audiovisual speech recognition. The models trained using the Kal...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- EURASIP J. Adv. Sig. Proc.
دوره 2007 شماره
صفحات -
تاریخ انتشار 2007